Method and apparatus for controlling processor operation speed

ABSTRACT

The present invention registers execution modules in association with operating speed attribute data by analyzing code containing operating speed of each of the execution modules as attribute data; groups the registered execution modules by operating speed based on the associated operating speed attribute data and creates a file header containing attributes of each group; upon loading an executable file containing said file header into memory at the time of execution, associates the operating speed attribute data with an address range of the loading for each execution module in the executable file; and controls the operating speed of the processor executing an execution module according to the operating speed attribute data associated with the address of the execution module when the module is executed.

BACKGROUND OF THE INVENTION

[0001] The present invention relates generally to a technique for reducing power consumption by a processor, and more particularly to such a technique for reducing power consumption by allowing clock speed of the processor to be varied.

[0002] Computers, particularly personal computers, have been widely used not only in enterprises but also in homes and schools. The performance of microprocessors used in the personal computers has been enhanced dramatically, and it is not unusual for the operating clock to exceed 1 GHz today. As the operating clock increases, however, both power consumption and heating increase, and a problem to be solved for high speed microprocessors is how to suppress such power consumption and heating. Notebook computers in particular are operated with battery power, and reducing power consumption is therefore even more important with notebook computers than with desktop computers.

[0003] In view of the above, some techniques have been proposed to vary a clock speed of a processor to reduce power consumption and heating in the processor. Such proposals are based on the understanding that not all programs need to run at the same clock speed. Japanese Patent Publication H8-76874, for example, teaches a CPU clock control device and method comprising one or more performance information setting circuits for setting CPU performance information required for each task, a selection information generating circuit for determining the clock frequency of the CPU so that the CPU operates at the minimum performance level required by a task started, an oscillation circuit for generating a plurality of clock signals, and a clock selecting circuit for selecting one of the clock signals and providing it to the CPU. This clock control is aimed at operating the CPU with low power consumption even at the time of task execution in a multitasking environment, depending on the program to be executed in the CPU, by switching automatically to the slowest CPU clock at which the performance requirement of that program is satisfied, thereby reducing power consumption.

[0004] U.S. Pat. No. 6,138,232 (Japanese Patent Publication H10-228383) teaches a microprocessor and operating method therefor in which an interrupt from one of a plurality of interrupt sources is accepted to change from operating on a current task to operating on a priority task, and a rate of microprocessor instruction operation during operation in response to an interrupt is set depending upon the interrupt source producing the interrupt. According to this teaching, a rate table mapping each interrupt source to a rate of instruction operation is provided, which is accessed upon receipt of an interrupt to obtain the rate of instruction operation corresponding to the interrupt source.

[0005] As seen from the above, the prior art has sought reduction of power consumption and heating by changing the operating speed (clock) of a processor depending on a task to be executed or a type of interrupt to be processed.

[0006] The conventional techniques change the clock speed based on a predetermined task or interrupt, which may not achieve optimum operation in the case where a task comprises a mixture of codes, some of which require high speed operation while others can run at a slower speed. In order to achieve the optimum operation aiming at the reduction of power consumption and heating, more sophisticated control, specifically control based on an execution address and hence execution code of a program, is required.

SUMMARY OF THE INVENTION

[0007] Therefore, a purpose of the present invention is to provide more sophisticated power management by controlling clock speed of a processor based on an execution address and hence execution code.

[0008] A further purpose of the invention is to allow operation clocks to be specified for a program upon producing the program in order to achieve the above power management.

[0009] According to a first aspect of the present invention, a method for controlling operating speed of a processor is provided which comprises the steps of: registering execution modules in association with operating speed attribute data by analyzing code containing operating speed of each of the execution modules as attribute data; grouping the registered execution modules by operating speed based on the associated operating speed attribute data and creating a file header containing attributes of each group; upon loading an executable file containing the file header into memory at the time of execution, associating operating speed attribute data with an address range of the loading for each execution module in the executable file; and controlling the operating speed of the processor executing an execution module according to the operating speed attribute data associated with the address of the execution module when the module is executed.

[0010] According to a second aspect of the present invention, an apparatus for controlling operating speed of a processor is provided which comprises: means for registering execution modules in association with operating speed attribute data by analyzing code containing operating speed of each of the execution modules as attribute data; means for grouping the registered execution modules by operating speed based on the associated operating speed attribute data, and creating a file header containing attributes of each group; means for, upon loading an executable file containing the file header into memory at the time of execution, associating operating speed attribute data with an address range of the loading for each execution module in the executable file; and means for controlling the operating speed of the processor executing an execution module according to the operating speed attribute data associated with the address of the execution module when the module is executed.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] Some of the purposes of the invention having been stated, others will appear as the description proceeds, when taken in connection with the accompanying drawings, in which:

[0012] FIG. 1 is a block diagram showing an exemplary hardware configuration of a data processing system incorporating a processor whose operating speed is controlled according to the present invention;

[0013] FIG. 2 is a block diagram showing the basic configuration of the present invention;

[0014] FIG. 3 shows a portion of source code containing special descriptors for controlling the operating speed according to the present invention;

[0015] FIG. 4 shows a result of analyzing the source code shown in FIG. 3 by a compiler;

[0016] FIG. 5 shows how a linker rearranges object codes;

[0017] FIG. 6 shows an exemplary file header created by the linker;

[0018] FIG. 7 is a block diagram showing an example of operating speed control using a virtual address conversion mechanism;

[0019] FIG. 8 is a block diagram showing an exemplary register configuration for enabling the operating speed control in a processor that does not have a virtual address conversion mechanism;

[0020] FIG. 9 is a block diagram showing an exemplary circuit for decoding operating speed control bits and generating a selected clock signal;

[0021] FIG. 10 is a block diagram showing address comparison for clock control in a processor that does not have a virtual address conversion mechanism;

[0022] FIG. 11 is a block diagram showing an exemplary circuit for generating a variable clock signal; and

[0023] FIG. 12 is a block diagram showing an exemplary circuit for supporting an override clock instruction in the variable clock signal generating circuit shown in FIG. 11.

DETAILED DESCRIPTION OF THE ILLUSTRATIVE EMBODIMENTS

[0024] While the present invention will be described more fully hereinafter with reference to the accompanying drawings, in which a preferred embodiment of the present invention is shown, it is to be understood at the outset of the description which follows that persons of skill in the appropriate arts may modify the invention here described while still achieving the favorable results of this invention. Accordingly, the description which follows is to be understood as being a broad, teaching disclosure directed to persons of skill in the appropriate arts, and not as limiting upon the present invention.

[0025] Referring now more particularly to the accompanying drawings, FIG. 1 shows an exemplary hardware configuration of a data processing system incorporating a processor whose operating speed is controlled according to the present invention. While the data processing system 100 shown in FIG. 1 is assumed to be a personal computer of either notebook or desktop type, the present invention is not limited to such a personal computer and may be applied to any computer which requires reduction of power consumption.

[0026] In the data processing system 100 shown in FIG. 1, a processor 102 whose operating clock is controlled according to the present invention and main memory 104 are connected to a PCI bus 106 through a PCI bridge 108. The PCI bridge 108 may contain an integrated memory controller/cache memory for the processor 102. In addition to the PCI bridge 108, other components such as a LAN adapter 110, SCSI host bus adapter 112, expansion bus interface 114, audio card adapter 116, and graphics card adapter 118 can be connected to the PCI bus 106. The expansion bus interface 114 provides additional connection for a keyboard/touchpad 120, modem 122, and additional memory 124 while the SCSI host bus adapter 112 provides additional connection for a hard disk drive 126 and CD-ROM drive 128.

[0027] An operating system and application programs run by the processor 102 are stored, for example, in the hard disk drive 126, and read into the main memory 104 as needed.

[0028] Described next with reference to FIG. 2 to FIG. 12 is a technique for varying the clock speed of the processor 102 for each executable code of an application program according to the present invention when the application program is executed under the control of the operating system in a data processing system such as shown in FIG. 1.

[0029] FIG. 2 shows the basic configuration of the present invention including a source code 202 which describes a program containing operating speeds of executable modules (functions) as attribute data, a compiler 204 which registers the executable modules in association with the operating speed attribute data by analyzing the source code 202, a linker 206 which groups the registered execution modules by operating speed based on the associated operating speed attribute data and creates a file header containing the attribute of each group, an OS management program 208 which, upon loading an executable file containing the file header into memory at the time of execution, stores corresponding operating speed attribute data in a predetermined operating speed data storage for each execution module in the executable file, a program execution manager 210 which reads corresponding operating speed data from the operating speed data storage when an execution module is executed, and a clock controller 212 which controls the operating speed of the processor 102 executing the execution module, based on the read operating speed data. The source code 202, compiler 204, and linker 206 may be implemented in either the data processing system 100 shown in FIG. 1 or a different data processing system (using a cross compiler if the processor architecture is different). The OS management program 208 may be implemented as part of the operating system running in the data processing system 100 shown in FIG. 1. The program execution manager 210 and clock controller 212 may be implemented as internal functions of the processor 102.

[0030] A portion of the source code 202 relating to the present invention is shown in FIG. 3. The source code 202 shown in FIG. 3 is written in C, but a different language may also be used. In FIG. 3, “#pragma” is a descriptor referred to as a pragma directive, which allows a programmer to define the attributes of the corresponding function. In FIG. 3(A), FunctionH() is defined as a function running at high speed (hereinafter referred to as “high speed function”) which is loaded in an address area for high speed execution, and FunctionL() is defined as a function running at low speed (hereinafter referred to as “low speed function”) which is loaded in an address area for low speed execution. As shown in FIG. 3(B), processor instructions for overriding the operating speed settings by the pragma directive may be used. In the example shown in FIG. 3(B), FunctionM(), defined as a function running at medium speed (hereinafter referred to as “medium speed function”), is switched to a high speed clock by a processor instruction “highclk” during normal processing, and switched back to the original clock by another processor instruction “rstclk”. These processor instructions can be executed by providing, for example, processor specific library functions.

[0031] With the pragma directive shown in FIG. 3, the programmer's knowledge relating to the optimum operating clock for code can be reflected in object code for each of the execution modules therein. While each of the high, medium, and low speed operations is explicitly specified in FIG. 3, it is also possible to define a particular speed (e.g., high speed) as a default and explicitly specify only the other speeds (e.g., medium and low). Also, while the number of available speeds is three in the present embodiment, it may be two, i.e., high and low, or four or more.

[0032] As shown in FIG. 4, the compiler 204 creates object code and a function registration table containing the operating speed data specified by the pragma directives as attribute data by analyzing the source code 202. One skilled in the art can readily enable the compiler 204 to analyze the pragma directives relating to the operating speed. The structure shown in FIG. 4 includes a function registration section 402 registering data for each function analyzed by the compiler 204, a function name section 404 registering the names of the registered functions, a function attribute section 406 registering attributes such as clock speed and code size of each function, and an object code section 408 containing the object code of each function. The function registration section 402 stores a name pointer (NamePtr) 402A, attribute pointer (AttrPtr) 402B, and object pointer (ObjPtr) 402C for each registered function. The name pointer 402A and attribute pointer 402B specify corresponding locations in the function name section 404 and function attribute section 406, respectively.

[0033] The object pointer 402C in the function registration section 402 specifies the start address of the object code of the corresponding function in the object code section 408. In the example shown in FIG. 4, the object code section 408 includes an area 408A containing the object code of the high speed function, FunctionH, and an area 408B containing the object code of the low speed function, FunctionL.

[0034] The registering operation performed by the compiler 204 to generate the structure shown in FIG. 4 is described next using the source code shown in FIG. 3(A) as an example. The compiler 204 analyzes the first pragma directive “#pragma (HIGH_SPEED, FunctionH)” and registers FunctionH as a first function “Function#1” in the function registration section 402. It also registers the function name “FunctionH”, its attributes (including high speed attribute, code size), and object code in the function name section 404, function attribute section 406, and object code section 408A, respectively, and sets the name pointer 402A to the function name section 404, attribute pointer 402B to the function attribute section 406, and object pointer 402C to the object code section 408A, in the function registration section 402. The compiler 204 performs the same registering operation for the second pragma directive “#pragma (LOW_SPEED, FunctionL)”. While the functions containing arguments arg1, arg2, . . . in FIG. 3 are defined as a void type which does not return any value, other data types may also be used.
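The registration structure of FIG. 4 might be modeled in C roughly as follows. This is a minimal sketch under stated assumptions, not the compiler's actual data layout: the type names (speed_attr, func_attr, func_reg, reg_table), the fixed capacity MAX_FUNCS, and the helper register_function are all hypothetical names introduced for illustration. Only the three pointers (NamePtr, AttrPtr, ObjPtr) and the speed and code size attributes come from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical speed attribute values; the names mirror the pragma keywords. */
typedef enum { LOW_SPEED, MEDIUM_SPEED, HIGH_SPEED } speed_attr;

/* One entry of the function attribute section (406). */
typedef struct {
    speed_attr speed;
    size_t code_size;              /* size of the object code in bytes */
} func_attr;

/* One entry of the function registration section (402). */
typedef struct {
    const char *name_ptr;          /* NamePtr 402A */
    const func_attr *attr_ptr;     /* AttrPtr 402B */
    const unsigned char *obj_ptr;  /* ObjPtr 402C */
} func_reg;

#define MAX_FUNCS 16               /* assumed capacity, for the sketch only */

typedef struct {
    func_reg reg[MAX_FUNCS];
    func_attr attr[MAX_FUNCS];
    size_t count;
} reg_table;

/* Register one function; returns its 1-based number ("Function#1", ...),
 * or 0 if the table is full. */
size_t register_function(reg_table *t, const char *name, speed_attr speed,
                         const unsigned char *code, size_t code_size)
{
    assert(t != NULL && name != NULL);
    if (t->count >= MAX_FUNCS)
        return 0;
    size_t i = t->count++;
    t->attr[i].speed = speed;
    t->attr[i].code_size = code_size;
    t->reg[i].name_ptr = name;
    t->reg[i].attr_ptr = &t->attr[i];
    t->reg[i].obj_ptr = code;
    return i + 1;
}
```

Registering FunctionH and then FunctionL on a zero-initialized table would yield entries 1 and 2, each pointing at its own name, attributes, and object code.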

[0035] The linker 206, which collects the object codes in the same blocks as much as possible based on the table structure created by the compiler 204 in FIG. 4, is described next. The linker groups the functions registered in the object code section 408 in FIG. 4 by operating speed. In the example shown in FIG. 5, the linker collects the high speed function code 1 (FunctionH1) in the object code 1 and the high speed function code 2 (FunctionH2) in the object code 2, as the high speed function group 502; puts the medium speed function code 1 (FunctionM1) in the object code 2 into the medium speed function group 504; and collects the low speed function code 1 (FunctionL1) in the object code 1 and the low speed function code 2 (FunctionL2) in the object code 2, as the low speed function group 506. In this way, by grouping the generated object codes depending on the modules having the same operating speeds, the linker avoids fragmentation of execution address ranges segmented for the respective operating speeds. The linker then creates a file header as shown in FIG. 6 based on such group data.

[0036] The file header 602 has one entry per group as shown in FIG. 6. Each entry contains an object pointer 602A to a corresponding function group, group size data 602B, and clock speed data 602C (high, medium, or low). The group object pointer 602A specifies the start address of the high speed function group 604, medium speed function group 606, or low speed function group 608.
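The grouping and header creation described above can be sketched in C as follows. This is an illustrative model only: the types linked_func and header_entry, the function link_by_speed, and the choice to lay out groups in the order high, medium, low are assumptions, and offsets into the output image stand in for the group object pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SPEED_LOW = 0, SPEED_MEDIUM = 1, SPEED_HIGH = 2 } clock_speed;
#define NUM_SPEEDS 3

/* A registered function as seen by the linker (hypothetical layout). */
typedef struct {
    const char *name;
    clock_speed speed;
    size_t size;     /* object code size in bytes */
    size_t offset;   /* filled in here: offset in the output image */
} linked_func;

/* One file header entry per group, after FIG. 6. */
typedef struct {
    size_t group_offset;  /* group object pointer 602A */
    size_t group_size;    /* group size data 602B */
    clock_speed speed;    /* clock speed data 602C */
} header_entry;

/* Lay out the functions so that each speed group is contiguous and fill
 * one header entry per group (hdr must have NUM_SPEEDS entries).
 * Returns the total image size. */
size_t link_by_speed(linked_func *funcs, size_t n, header_entry *hdr)
{
    static const clock_speed order[NUM_SPEEDS] = {
        SPEED_HIGH, SPEED_MEDIUM, SPEED_LOW
    };
    size_t cursor = 0;

    assert(funcs != NULL && hdr != NULL);
    for (size_t g = 0; g < NUM_SPEEDS; g++) {
        hdr[g].group_offset = cursor;
        hdr[g].speed = order[g];
        /* Collect every function of this speed, preserving input order. */
        for (size_t i = 0; i < n; i++) {
            if (funcs[i].speed == order[g]) {
                funcs[i].offset = cursor;
                cursor += funcs[i].size;
            }
        }
        hdr[g].group_size = cursor - hdr[g].group_offset;
    }
    return cursor;
}
```

Because each group occupies a single contiguous range, one header entry per speed is enough to describe the whole image, which is the fragmentation-avoiding property the linker is after.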

[0037] After the linker 206 creates the file header, the OS management program 208 reads the smallest execution module and its clock speed data from the file header, reserves an area for loading the module in order to obtain the module address so that the module is executed at the specified operating speed, and stores the clock speed data in the clock speed data storage corresponding to that address area. The operation of the OS management program 208 is described next using the virtual address conversion mechanism shown in FIG. 7 as an example.

[0038] A known virtual address conversion mechanism is shown in the top half of FIG. 7 which converts a virtual (logical) address 706 to a physical address 708 using a page directory (or segment table) 702 and a page table 704. The virtual address 706 consists of a page directory number D, a page table number T, and an offset. The page directory number D is added to a base address stored in a page directory base address register (PDBR) 710 to select a specific page directory entry (PDE). The selected PDE is used as a base address of the page table 704 to which the page table number T is added to select a specific page table entry PTE. The PTE contains a frame (page address) which is concatenated with the offset in the virtual address 706 to form the physical address 708.
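The two-level lookup just described can be illustrated with a short C model. The field widths (10-bit D, 10-bit T, 12-bit offset in a 32-bit address) are an assumption chosen for the sketch; the description does not fix them. The types pde_t and pte_t and the function translate are likewise hypothetical, and pointers stand in for the base addresses held in the PDBR and the PDE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed split of a 32-bit virtual address: D (10 bits), T (10 bits),
 * offset (12 bits). */
#define OFFSET_BITS 12u
#define INDEX_BITS  10u
#define INDEX_MASK  ((1u << INDEX_BITS) - 1u)
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)
#define ENTRIES     (1u << INDEX_BITS)

typedef struct { uint32_t frame; } pte_t;  /* page table entry: frame number */
typedef struct { pte_t *table; } pde_t;    /* page directory entry: page table base */

/* Translate vaddr using the directory at pdbr.
 * Returns 0 on success, -1 if the selected PDE has no page table. */
int translate(const pde_t *pdbr, uint32_t vaddr, uint32_t *paddr)
{
    assert(pdbr != NULL && paddr != NULL);

    uint32_t d = (vaddr >> (OFFSET_BITS + INDEX_BITS)) & INDEX_MASK;
    uint32_t t = (vaddr >> OFFSET_BITS) & INDEX_MASK;
    uint32_t offset = vaddr & OFFSET_MASK;

    const pde_t *pde = &pdbr[d];            /* PDBR base + D selects the PDE */
    if (pde->table == NULL)
        return -1;
    const pte_t *pte = &pde->table[t];      /* PDE base + T selects the PTE */

    /* The frame is concatenated with the offset to form the physical address. */
    *paddr = (pte->frame << OFFSET_BITS) | offset;
    return 0;
}
```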

[0039] The bottom half of FIG. 7 shows the improvement of this virtual address conversion mechanism according to the present invention. Specifically, control bits indicating the clock speed data are added to each entry of the page table 704. In the example shown in FIG. 7, there are two control bits where “11”, “10”, and “01” represent high, medium, and low speed operations, respectively.

[0040] The OS management program 208 reads the smallest execution modules and corresponding clock speed data from the file header created by the linker 206, allocates execution modules having the same operating speed to a specific page area in the physical address map 712, and sets the clock speed control bits in the page table entry corresponding to this page area according to the clock speed data read from the file header. In the example shown in FIG. 7, the OS management program 208 allocates the high speed functions “FunctionH1” and “FunctionH2” to the first page area 712A, allocates the medium speed function “FunctionM1” to the second page area 712B, and allocates the low speed functions “FunctionL1” and “FunctionL2” to the third page area 712C. The OS management program 208 also sets the clock speed control bits in the page table entries for the page areas 712A, 712B, and 712C to “11”, “10”, and “01”, respectively. Thus, the first page area 712A is defined as a high speed clock page area, the second page area 712B as a medium speed clock page area, and the third page area 712C as a low speed clock page area.
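As a rough illustration of how the clock speed control bits might sit in a page table entry, consider the C sketch below. The encodings “11”, “10”, and “01” are taken from the description; the bit positions (frame in bits 31..12, control bits in bits 1..0) and all identifiers are assumptions made for the example, and a real page table entry would carry other flags as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodings from the description: "11" high, "10" medium, "01" low. */
enum { CLK_LOW = 0x1, CLK_MEDIUM = 0x2, CLK_HIGH = 0x3 };

/* Assumed layout: frame number in bits 31..12, clock bits in bits 1..0. */
#define PTE_CLK_MASK    0x3u
#define PTE_FRAME_SHIFT 12u

static inline uint32_t pte_make(uint32_t frame, unsigned clk)
{
    return (frame << PTE_FRAME_SHIFT) | (clk & PTE_CLK_MASK);
}

static inline unsigned pte_clock_bits(uint32_t pte)
{
    return pte & PTE_CLK_MASK;
}

static inline uint32_t pte_frame(uint32_t pte)
{
    return pte >> PTE_FRAME_SHIFT;
}

/* Set the clock bits of `count` consecutive entries starting at `first`,
 * leaving the frame numbers untouched. */
void set_page_speed(uint32_t *page_table, unsigned first, unsigned count,
                    unsigned clk)
{
    assert(page_table != NULL);
    for (unsigned i = first; i < first + count; i++)
        page_table[i] = (page_table[i] & ~PTE_CLK_MASK) | (clk & PTE_CLK_MASK);
}
```

In the FIG. 7 example, a loader following this model would mark the entry for page area 712A with CLK_HIGH, 712B with CLK_MEDIUM, and 712C with CLK_LOW.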

[0041] If the processor does not have a virtual address conversion mechanism such as the one shown in FIG. 7, range registers 802, 804, and 806 for storing plural execution address ranges and corresponding clock speed control bits may be provided as external I/O registers or internal processor registers as shown in FIG. 8. The high speed clock range register 802 stores the start address “Addr_H” and length “Length_H” of the address area 808A in the physical address map 808 where the high speed functions are loaded, and the control bits “11”. The medium speed clock range register 804 stores the start address “Addr_M” and length “Length_M” of the address area 808B in the physical address map 808 where the medium speed functions are loaded, and the control bits “10”. The low speed clock range register 806 stores the start address “Addr_L” and length “Length_L” of the address area 808C in the physical address map 808 where the low speed functions are loaded, and the control bits “01”. Note that if the registers 802, 804, and 806 are fixed as high, medium, and low speed registers, respectively, the control bits are not necessary, but if the roles of the registers are changed dynamically, then the control bits are required.

[0042] The OS management program 208 may be simplified to take the form of firmware that integrates a function of the OS management program 208 and a function of the executable program part, and can directly reference the address labels of executable codes. In that case, the OS management program function of the firmware may store the control data in the clock speed data storage (control bits) shown in FIG. 7 or FIG. 8 by the direct reference to the address labels separating the high, medium, and low speed operation codes. Since the executable program does not require the header data in this case, the attribute management function for the clock speed may be omitted from the compiler 204 and linker 206.

[0043] When the OS management program 208 finishes loading the executable file and setting the clock speed data, the program execution manager 210 reads the clock speed control bits corresponding to the address of the code to be executed in order to execute the code in the loaded executable file at the specified operating speed, supplies the clock control data to the clock controller 212, and then executes the code at the operating speed adjusted by the clock controller 212.

[0044] An example of a circuit for generating the clock control signals supplied to the clock controller 212 is shown in FIG. 9. The clock speed control bits (two bits in this preferred embodiment) read from the page table 704 in FIG. 7 or one of the range registers 802 to 806 in FIG. 8 are input to a clock control signal decoder 902. The decoder 902 decodes the control bits and activates one of the three output lines accordingly. The output lines correspond to the high, medium, and low speed clocks, respectively. In response to the output signal from the decoder 902 and a valid execution address timing signal, a clock control signal enable circuit 904 outputs an appropriate high speed (H), medium speed (M), or low speed (L) clock signal. The valid execution address timing signal indicates when the address of the code executed by the processor 102 is valid on an address bus (not shown).
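The behavior of the decoder 902 and the enable circuit 904 can be modeled in software as a small truth table. The following C sketch is a behavioral illustration only, not a description of the actual circuit; the type clock_select and both function names are hypothetical, and treating the encoding “00” as “no line active” is an assumption, since the description does not define it.

```c
#include <assert.h>
#include <stdbool.h>

/* One-hot clock select lines: at most one is set. */
typedef struct { bool h, m, l; } clock_select;

/* Decoder 902: two control bits activate one of three output lines. */
clock_select decode_clock_bits(unsigned bits)
{
    clock_select s = { false, false, false };

    assert(bits <= 0x3u);        /* only two control bits exist */
    switch (bits) {
    case 0x3u: s.h = true; break;    /* "11" high */
    case 0x2u: s.m = true; break;    /* "10" medium */
    case 0x1u: s.l = true; break;    /* "01" low */
    default: break;                  /* "00": assumed to select nothing */
    }
    return s;
}

/* Enable circuit 904: pass the decoded line only while the execution
 * address on the bus is valid. */
clock_select enable_clock(clock_select decoded, bool addr_valid)
{
    clock_select out = {
        decoded.h && addr_valid,
        decoded.m && addr_valid,
        decoded.l && addr_valid
    };
    return out;
}
```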

[0045] If the processor 102 has a virtual address conversion mechanism, the clock speed control bits contained in the page table entry specified by the virtual address can be supplied to the decoder 902, but if the processor does not have a virtual address conversion mechanism, address comparison such as shown in FIG. 10 is necessary. In FIG. 10, the address of the code to be executed is supplied to the first inputs of three comparators 1002, 1004, and 1006 constituting an address comparator. The second input of the comparator 1002 receives the start address “Addr_H” and length “Length_H” indicating the address range of the high speed clock address area 808A from the range register 802 shown in FIG. 8. The second input of the comparator 1004 receives the start address “Addr_M” and length “Length_M” indicating the address range of the medium speed clock address area 808B from the range register 804. The second input of the comparator 1006 receives the start address “Addr_L” and length “Length_L” indicating the address range of the low speed clock address area 808C from the range register 806. If the execution address resides in one of these address ranges, the comparator corresponding to that address range activates a corresponding input of the clock control signal enable circuit 904. In the example shown in FIG. 10, the control bits are not stored in the registers 802 to 806. As noted above, however, if the role of each register is not fixed, the control bits indicating the clock speed are needed, in which case the control bits stored in the register selected by the output from the comparators 1002 to 1006 are supplied to the clock control signal decoder 902. While the decoder 902 is not shown in FIG. 10, it is required when the control bits are used.
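The range comparison can be sketched in C as follows, for the case in which each range register also carries its control bits. The struct range_reg and the function lookup_clock_bits are hypothetical names; the hardware comparators of FIG. 10 operate in parallel, whereas this model checks the registers one after another, and returning 0 when no range matches is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One range register, after FIG. 8. */
typedef struct {
    uint32_t start;     /* e.g. Addr_H */
    uint32_t length;    /* e.g. Length_H */
    unsigned clk_bits;  /* "11", "10", or "01"; needed only if roles are not fixed */
} range_reg;

/* Returns the control bits of the first register whose range contains addr,
 * or 0 if the address lies in none of the ranges. */
unsigned lookup_clock_bits(const range_reg *regs, size_t n, uint32_t addr)
{
    assert(regs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* addr - start wraps around for addr < start, so a single unsigned
         * comparison checks both the lower and the upper bound. */
        if ((uint32_t)(addr - regs[i].start) < regs[i].length)
            return regs[i].clk_bits;
    }
    return 0;
}
```

The returned bits would then feed the decoder 902 in the same way as bits read from a page table entry.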

[0046] The clock controller 212, which may be constituted by a frequency divider 1102 as shown in FIG. 11, receives the output from the clock control signal enable circuit 904 and supplies a corresponding variable clock signal. The frequency divider 1102 outputs an external base clock signal supplied from an oscillator (not shown) as the system clock as it is when the clock control signal enable circuit 904 outputs the H clock signal, outputs the base clock divided by two when the M clock signal is received, or outputs the base clock divided by three when the L clock signal is received. Note that these frequency division ratios 1, 1/2, and 1/3 are presented for illustration only, and other ratios may also be used.
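The arithmetic of the frequency divider can be shown with a small C model. It uses the illustrative ratios 1, 1/2, and 1/3 from the description; the enum clk_sel and the function names are hypothetical, and the model computes a frequency value rather than generating a clock waveform.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CLK_SEL_H, CLK_SEL_M, CLK_SEL_L } clk_sel;

/* Division ratios from the description: 1, 1/2, 1/3 (illustrative only). */
static unsigned divisor_for(clk_sel sel)
{
    switch (sel) {
    case CLK_SEL_H: return 1u;
    case CLK_SEL_M: return 2u;
    case CLK_SEL_L: return 3u;
    }
    return 1u;  /* not reached for valid input; fall back to the undivided clock */
}

/* System clock frequency in Hz for a given base clock and selection. */
uint32_t divided_clock_hz(uint32_t base_hz, clk_sel sel)
{
    unsigned d = divisor_for(sel);

    assert(d != 0u);  /* guard against division by zero */
    return base_hz / d;
}
```

With a 1.2 GHz base clock, for instance, the H, M, and L selections would give 1.2 GHz, 600 MHz, and 400 MHz.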

[0047] FIG. 11 shows an example in which the clock speed is not changed (overridden) by a processor instruction. If an override instruction and an instruction for canceling the override are inserted in the source code as shown in FIG. 3(B), however, an additional circuit is required to support these instructions. An example of such a circuit is shown in FIG. 12. The added circuit is an override signal selector 1202. This override signal selector 1202 has input terminals 1B, 2B, and 3B receiving clock signals for overriding, in addition to input terminals 1A, 2A, and 3A receiving the same clock signals as those shown in FIG. 11. When an override instruction is detected and an enable override signal is active, an override clock signal specified by the override instruction is output from an output terminal 1Y, 2Y, or 3Y to the frequency divider 1102 instead of the clock signals H, M, L.
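Functionally, the override signal selector behaves like a two-input multiplexer placed ahead of the frequency divider. The C sketch below models that behavior under stated assumptions: the enum clk_line, the struct override_state, and the functions exec_highclk and exec_rstclk are hypothetical stand-ins for the effect of the “highclk” and “rstclk” instructions of FIG. 3(B), not actual instruction semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SEL_H, SEL_M, SEL_L } clk_line;

/* Override signal selector 1202 as a 2-to-1 multiplexer: the A input is the
 * address-derived selection, the B input is the override selection. */
clk_line select_clock(clk_line from_address, clk_line override_line,
                      bool override_enable)
{
    return override_enable ? override_line : from_address;
}

/* Hypothetical override state driven by the two instructions. */
typedef struct {
    bool enable;      /* enable override signal */
    clk_line line;    /* clock requested by the override instruction */
} override_state;

/* "highclk": request the high speed clock regardless of the address. */
void exec_highclk(override_state *s)
{
    assert(s != NULL);
    s->enable = true;
    s->line = SEL_H;
}

/* "rstclk": cancel the override so the address-derived clock applies again. */
void exec_rstclk(override_state *s)
{
    assert(s != NULL);
    s->enable = false;
}
```

For a medium speed function, the selected line would be M before highclk, H between highclk and rstclk, and M again afterwards.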

[0048] In the drawings and specifications there has been set forth a preferred embodiment of the invention and, although specific terms are used, the description thus given uses terminology in a generic and descriptive sense only and not for purposes of limitation.

1. A method for controlling operating speed of a processor which allows said operating speed to be varied depending on an execution module, comprising the steps of: registering execution modules in association with operating speed attribute data by analyzing code containing operating speed of each of the execution modules as attribute data; grouping the registered execution modules by operating speed based on the associated operating speed attribute data, and creating a file header containing attributes of each group; upon loading an executable file containing said file header into memory at the time of execution, associating operating speed attribute data with an address range of said loading for each execution module in said executable file; and controlling the operating speed of the processor executing an execution module according to the operating speed attribute data associated with the address of said execution module when said module is executed.
2. A method for controlling operating speed of a processor as described in claim 1, wherein said associating step includes a step for storing operating speed data indicated by said operating speed attribute data in a predetermined operating speed data storage.
3. A method for controlling operating speed of a processor as described in claim 2, wherein said controlling step reads corresponding operating speed data from said operating speed data storage to control the operating speed of said processor.
4. A method for controlling operating speed of a processor as described in claim 2 or 3, wherein said operating speed data storage is provided in a conversion table of a virtual address conversion mechanism.
5. A method for controlling operating speed of a processor as described in claim 1, wherein said associating step includes a step of storing data indicating said address range in registers corresponding to the respective operating speeds.
6. A method for controlling operating speed of a processor as described in claim 5, wherein said controlling step includes a step of comparing the address of said execution module with the data stored in said registers to provide an operating speed corresponding to the address range containing said address.
7. An apparatus for controlling operating speed of a processor which allows said operating speed to be varied depending on an execution module, comprising: means for registering execution modules in association with operating speed attribute data by analyzing code containing operating speed of each of the execution modules as attribute data; means for grouping the registered execution modules by operating speed based on the associated operating speed attribute data, and creating a file header containing attributes of each group; means for, upon loading an executable file containing said file header into memory at the time of execution, associating operating speed attribute data with an address range of said loading for each execution module in said executable file; and means for controlling the operating speed of the processor executing an execution module according to the operating speed attribute data associated with the address of said execution module when said module is executed.
8. An apparatus for controlling operating speed of a processor as described in claim 7, wherein said associating means includes an operating speed data storage for storing operating speed data indicated by said operating speed attribute data.
9. An apparatus for controlling operating speed of a processor as described in claim 8, wherein said controlling means reads corresponding operating speed data from said operating speed data storage to control the operating speed of said processor.
10. An apparatus for controlling operating speed of a processor as described in claim 8 or 9, wherein said operating speed data storage is provided in a conversion table of a virtual address conversion mechanism.
11. An apparatus for controlling operating speed of a processor as described in claim 7, wherein said associating means includes plural registers for storing data indicating said address range correspondingly to the respective operating speeds.
12. An apparatus for controlling operating speed of a processor as described in claim 11, wherein said controlling means includes means for comparing the address of said execution module with the data stored in said registers to provide an operating speed corresponding to the address range containing said address.